Erfan Ayyobi; Kamyar Mansouri; Mohammad Golmahi; Ozra Ramezan khani; Alireza Mosavi Jarrahi
Volume 22, Special Issue, March and April 2016, Pages 1158-1171
Abstract
Advances in research and information technology have produced large datasets containing valuable information. In health research, tracing individuals across different datasets, such as registries, can provide valuable information about prognosis, prediction, discrimination, detection, or etiology for many outcomes without the need for costly studies. Knowledge is extracted from these data using methods such as data linkage (record linkage) with deterministic or probabilistic algorithms. However, probabilistic linkage is computationally complex and not well understood by many researchers who may wish to apply it in their work. The purpose of this review article is therefore to introduce probabilistic record linkage methodology, including dataset quality and standardization, identification of matching records across datasets, calculation of matching weights, and discrimination of matched from unmatched records using a cut point. The methodology is then illustrated with a practical example that links cancer registry and mortality datasets.
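As a rough illustration of the matching-weight and cut-point steps named in the abstract, the following minimal sketch uses log-ratio agreement and disagreement weights in the style of the Fellegi-Sunter model, which is a common formulation and not necessarily the one used in the article. The field names, the m- and u-probabilities, the cut point, and the function names are all hypothetical values chosen for illustration.

```python
import math

# Hypothetical per-field probabilities (illustrative values, not from the article).
# m = P(field agrees | the two records belong to the same person)
# u = P(field agrees | the two records belong to different people)
FIELD_PROBS = {
    "surname": (0.95, 0.01),
    "birth_year": (0.98, 0.05),
    "sex": (0.99, 0.50),
}


def field_weight(agrees, m, u):
    """Log2 agreement weight if the field agrees, else the disagreement weight."""
    if agrees:
        return math.log2(m / u)
    return math.log2((1 - m) / (1 - u))


def match_weight(rec_a, rec_b, probs=FIELD_PROBS):
    """Sum the field weights for one candidate record pair.

    Assumption: fields are compared by exact equality after standardization.
    """
    total = 0.0
    for field, (m, u) in probs.items():
        agrees = rec_a[field] == rec_b[field]
        total += field_weight(agrees, m, u)
    return total


def classify(weight, cut_point):
    """Label a pair as a match when its total weight reaches the cut point."""
    return "match" if weight >= cut_point else "non-match"


# Hypothetical standardized records from a cancer registry and a mortality file.
cancer_record = {"surname": "ahmadi", "birth_year": 1950, "sex": "F"}
death_same = {"surname": "ahmadi", "birth_year": 1950, "sex": "F"}
death_other = {"surname": "karimi", "birth_year": 1962, "sex": "F"}

CUT_POINT = 5.0  # hypothetical threshold; in practice chosen from the weight distribution

for candidate in (death_same, death_other):
    w = match_weight(cancer_record, candidate)
    print(round(w, 2), classify(w, CUT_POINT))
```

Fields that agree rarely by chance (small u) contribute large positive weights, while a field such as sex, which agrees by chance about half the time, adds little. In practice the m- and u-probabilities are usually estimated from the data, and some implementations use two cut points with a clerical-review zone between them.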